PDF articles metadata harvester

Author

  • Leon Andretti Abdillah
Abstract

Scientific journals are very important for recording the findings of researchers around the world. The current medium for disseminating scientific journals is PDF. One scheme for finding scientific journals over the internet is via metadata. Metadata stores summary information about an article. Embedding metadata into the PDF of a scientific article helps ensure that the metadata stays consistently available with the article. Harvesting metadata from scientific journals is a very interesting field at the moment. This paper discusses scientific journal metadata harvesters involving XMP.
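As a rough illustration of what harvesting embedded XMP metadata can look like, the following Python sketch scans raw PDF bytes for an XMP packet and reads two Dublin Core fields from it. It is not the harvester described in the paper. The function names (`extract_xmp_packet`, `harvest_dublin_core`) and the sample bytes are hypothetical, and the sketch assumes the packet is stored uncompressed in the file, which is common for XMP but not guaranteed.

```python
import re
import xml.etree.ElementTree as ET

# Namespace prefixes in the form ElementTree expects ("{uri}tag").
DC = "{http://purl.org/dc/elements/1.1/}"
RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"

# An XMP packet is wrapped in <?xpacket begin=...?> ... <?xpacket end=...?>.
PACKET_RE = re.compile(
    rb"<\?xpacket begin=.*?\?>(.*?)<\?xpacket end=.*?\?>", re.DOTALL
)


def extract_xmp_packet(pdf_bytes):
    """Return the body of the first uncompressed XMP packet, or None."""
    match = PACKET_RE.search(pdf_bytes)
    return match.group(1).strip() if match else None


def harvest_dublin_core(packet):
    """Collect dc:title and dc:creator values from an XMP packet body."""
    root = ET.fromstring(packet)
    record = {}
    for field in ("title", "creator"):
        values = []
        # Values sit in rdf:li items inside an rdf:Alt or rdf:Seq container.
        for element in root.iter(DC + field):
            values.extend(li.text for li in element.iter(RDF + "li") if li.text)
        record[field] = values
    return record


# Hypothetical stand-in for a PDF file: only the XMP packet matters here.
SAMPLE_PDF = (
    b"%PDF-1.4\n1 0 obj\n<< /Type /Metadata /Subtype /XML >>\nstream\n"
    b'<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>\n'
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/">'
    b'<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
    b'<rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">'
    b'<dc:title><rdf:Alt><rdf:li xml:lang="x-default">'
    b"PDF articles metadata harvester</rdf:li></rdf:Alt></dc:title>"
    b"<dc:creator><rdf:Seq><rdf:li>Leon Andretti Abdillah</rdf:li>"
    b"</rdf:Seq></dc:creator>"
    b"</rdf:Description></rdf:RDF></x:xmpmeta>\n"
    b'<?xpacket end="w"?>\nendstream\nendobj\n%%EOF\n'
)

if __name__ == "__main__":
    packet = extract_xmp_packet(SAMPLE_PDF)
    if packet is not None:
        print(harvest_dublin_core(packet))
```

Byte scanning keeps the sketch free of third-party dependencies, but it will miss packets stored in compressed streams. A real harvester would more likely rely on a PDF library to locate and decode the metadata stream.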

Similar resources

The CARL metadata harvester and search service

Purpose – To explain the background, functionality, and content of the CARL metadata harvester and search service, http://carl-abrc-oai.lib.sfu.ca/, and to outline plans for improving the service. Design/methodology/approach – This case study employs simple statistical analyses to a set of harvested metadata. Findings – This paper documents the use of unqualified Dublin Core (uDC) elements in t...

Genre Classification in Automated Ingest and Appraisal Metadata

Metadata creation is a crucial aspect of the ingest of digital materials into digital libraries. Metadata needed to document and manage digital materials are extensive and manual creation of them expensive. The Digital Curation Centre (DCC) has undertaken research to automate this process for some classes of digital material. We have segmented the problem and this paper discusses results in gen...

SciPDFindexer: Distributed Information Retrieval system using MapReduce

Indexing allows the conversion of raw document collections into easily searchable formats. Bigger scale indexing poses some challenges in terms of efficiently distributing indexing computation on a cluster of nodes. MapReduce framework promises to be an effective tool for parallelizing such tasks as inverted index construction. We propose SciPDFindexer, a distributed information retrieval syste...

A Novel Parallel Architecture Design of Information Retrieval System for Scientific Papers

Indexing allows converting raw document collection into easily searchable representation. Bigger scale indexing poses some challenges such as how to distribute indexing computation efficiently on a cluster of nodes. MapReduce framework can be an effective tool for parallelizing such tasks as inverted index construction. We propose SciPDFindexer, distributed information retrieval system for scie...

Retrieving Metadata for Your Local Scholarly Papers

We present a novel approach to retrieve metadata to scholarly papers stored locally as PDF files. A fingerprint is produced from the PDF fulltext to query an online metadata repository. The returned results are matched back to identify the correct metadata entry. These metadata can then be stored in the PDF itself, indexed for a desktop search engine, and collected in a user's or community's bi...

Journal title:
  • CoRR

Volume abs/1301.6591  Issue

Pages  -

Publication date 2013